Joint Optimisation for Object Class Segmentation and Dense Stereo Reconstruction
Abstract
The problems of object class segmentation [2], which assigns an object label such as road or building to every pixel in the image, and dense stereo reconstruction [1], in which every pixel in the image is labelled with a disparity, are well suited to being solved jointly. Both approaches formulate the problem of correctly labelling an image as Maximum a Posteriori (MAP) estimation over a Conditional Random Field (CRF), and both may use graph-cut-based move-making algorithms to solve the labelling problem.
A correct object class labelling can inform depth labelling, and stereo reconstruction can in turn improve object labelling. Object class boundaries are more likely to occur at a sudden transition in depth, and vice versa. Moreover, the height of a point above the ground plane is an extremely informative cue for its class label, and it can be computed from the depth. For example, road and sidewalk lie in the ground plane, pixels labelled pedestrian or car must lie above the ground plane, and pixels labelled sky must lie at infinite depth.
Our joint optimisation consists of two parts: object class segmentation and dense stereo reconstruction. We follow [2] in formulating object class segmentation as finding a minimal cost labelling of a CRF defined over a set of random variables X = {X1, ..., XN}, each taking a state from the label space L = {l1, l2, ..., lk}. Each label lj indicates a different object class such as car, road, building or sky. For the dense stereo reconstruction part of our joint formulation we use the energy formulation of [1], which poses the problem as finding a minimal cost labelling of a CRF defined over a set of random variables Y = {Y1, ..., YN}, where each variable Yi takes a state from the label space D = {d1, d2, ..., dm} corresponding to a set of disparities.
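The height cue described in the abstract can be illustrated with a minimal Python sketch. It assumes a rectified stereo pair, a pinhole camera whose optical axis is parallel to a flat ground plane, and image rows that increase downwards. None of these assumptions come from the paper, and the function and parameter names (`height_above_ground`, `focal_px`, `baseline_m`, `cam_height_m`, `cy`) are hypothetical. Zero disparity is treated as infinite depth, the case the abstract associates with sky.

```python
import numpy as np


def height_above_ground(disparity, focal_px, baseline_m, cam_height_m, cy):
    """Return (depth, height) maps computed from a disparity map.

    Assumptions (illustrative, not from the paper):
      * rectified stereo, so depth = focal_px * baseline_m / disparity;
      * the camera sits cam_height_m above a flat ground plane and its
        optical axis is parallel to that plane;
      * cy is the principal-point row and rows increase downwards.
    """
    disparity = np.asarray(disparity, dtype=float)
    rows = np.arange(disparity.shape[0], dtype=float)[:, None]
    rows = np.broadcast_to(rows, disparity.shape)

    valid = disparity > 0

    # Zero disparity corresponds to a point at infinite depth.
    depth = np.full(disparity.shape, np.inf)
    depth[valid] = focal_px * baseline_m / disparity[valid]

    # Height is undefined (NaN) for points at infinite depth.
    height = np.full(disparity.shape, np.nan)
    # Distance below the optical axis is (row - cy) * depth / focal_px.
    below_axis = (rows[valid] - cy) * depth[valid] / focal_px
    height[valid] = cam_height_m - below_axis
    return depth, height
```

Under these assumptions a pixel whose computed height is close to zero is consistent with road or sidewalk, while a clearly positive height is consistent with labels such as car or pedestrian.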
We formulate simultaneous object class segmentation and dense stereo reconstruction as an energy minimisation of a dense labelling z over the image. Each random variable Zi = [Xi, Yi] takes a label zi = [xi, yi] from the product space L × D of object class and disparity labels, which corresponds to the variable Zi taking object label xi and disparity yi. In general the energy of the CRF for joint estimation can be written as:
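The energy expression itself does not appear in this excerpt. As an illustration only, the Python sketch below assumes a generic pairwise form, E(z) = sum_i psi_i(z_i) + sum_(i,j) psi_ij(z_i, z_j), over the product label space L × D, and minimises it by exhaustive search on a three-pixel chain. The potentials, weights, labels and costs are all hypothetical, and brute force stands in for the graph-cut-based move-making algorithms mentioned in the abstract, which would be needed for real images.

```python
from itertools import product

CLASSES = ["road", "car", "sky"]   # hypothetical object label space L
DISPARITIES = [0, 4, 8]            # hypothetical disparity label space D
JOINT_LABELS = list(product(CLASSES, DISPARITIES))  # product space L x D


def unary(i, z, class_cost, disp_cost):
    """Cost of pixel i taking the joint label z = (x, y)."""
    x, y = z
    cost = class_cost[i][x] + disp_cost[i][y]
    # Illustrative coupling term: sky should lie at infinite depth,
    # which corresponds to zero disparity.
    if x == "sky" and y != 0:
        cost += 10.0
    return cost


def pairwise(z_a, z_b, w_class=0.5, w_disp=0.1):
    """Smoothness cost between two neighbouring joint labels."""
    (xa, ya), (xb, yb) = z_a, z_b
    return w_class * (xa != xb) + w_disp * abs(ya - yb)


def joint_energy(labelling, class_cost, disp_cost):
    """Sum of unary terms plus pairwise terms along a 1-D chain of pixels."""
    energy = sum(unary(i, z, class_cost, disp_cost)
                 for i, z in enumerate(labelling))
    energy += sum(pairwise(a, b) for a, b in zip(labelling, labelling[1:]))
    return energy


def brute_force_map(class_cost, disp_cost):
    """Exhaustive minimisation; only feasible for a handful of pixels."""
    n = len(class_cost)
    return min(product(JOINT_LABELS, repeat=n),
               key=lambda lab: joint_energy(lab, class_cost, disp_cost))


# Toy data: pixel 0 looks like sky, but its disparity evidence alone
# slightly prefers disparity 4 over disparity 0.
class_cost = [
    {"sky": 0.0, "road": 3.0, "car": 3.0},
    {"car": 0.0, "road": 3.0, "sky": 3.0},
    {"road": 0.0, "car": 3.0, "sky": 3.0},
]
disp_cost = [
    {0: 0.5, 4: 0.0, 8: 3.0},
    {0: 3.0, 4: 3.0, 8: 0.0},
    {0: 3.0, 4: 3.0, 8: 0.0},
]
best = brute_force_map(class_cost, disp_cost)
```

In this toy setup the disparity costs alone would assign disparity 4 to the first pixel, whereas the joint minimum assigns it (sky, 0): the class evidence corrects the disparity estimate, which is the kind of interaction the joint formulation is meant to capture.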
Related articles
Joint Object Pose Estimation and Shape Reconstruction in Urban Street Scenes Using 3D Shape Priors
Estimating the pose and 3D shape of a large variety of instances within an object class from stereo images is a challenging problem, especially in realistic conditions such as urban street scenes. We propose a novel approach for using compact shape manifolds of the shape within an object class for object segmentation, pose and shape estimation. Our method first detects objects and estimates the...
3DFS: Deformable Dense Depth Fusion and Segmentation for Object Reconstruction from a Handheld Camera
We propose an approach for 3D reconstruction and segmentation of a single object placed on a flat surface from an input video. Our approach is to perform dense depth map estimation for multiple views using a proposed objective function that preserves detail. The resulting depth maps are then fused using a proposed implicit surface function that is robust to estimation error, producing a smooth ...
Real-time Depth Reconstruction from Stereo Sequences
We propose a fast depth reconstruction algorithm for stereo sequences using camera geometry and disparity estimation. In disparity estimation process, we calculate dense background disparity fields in an initialization step so that only disparities of moving object regions are updated in the main process using real-time segmentation and hierarchical disparity estimation techniques. The estimate...
Active Stereo-matching for One-shot Dense Reconstruction
Stereo-vision in computer vision represents an important field for 3D reconstruction. Real time dense reconstruction, however, is only achieved for high textured surfaces in passive stereo-matching. In this work an active stereo-matching approach is proposed. A projected pattern is used to artificially increase the texture of the measuring object, thus enabling dense reconstruction for one-shot...
Global structured models towards scene understanding
Many scene understanding tasks are formulated as a labelling problem that tries to assign a label to each pixel of an image. These discrete labels may vary depending on the task, for example they may correspond to different object classes such as car, grass or sky, or to depths or to intensity after denoising. These labelling problems are typically formulated as a pairwise Markov or Conditional ...
Publication date: 2010